Reading the MemGPT Source Code

February 11, 2025 · 29 min read

1. Architecture Overview

1.1 MemGPT's Core Modules and Their Responsibilities

graph LR
    A[agent] --> B[agent_store]
    A --> C[data_types]
    A --> D[embeddings]
    A --> E[functions]
    A --> F[memory]
    A --> G[models]
    A --> H[persistence_manager] 
    A --> I[prompts]
    A --> J[server]

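agent: the core Agent logic, including initialization, message handling, and state management
agent_store: persistent storage for Agent state, with support for different backends (such as PostgreSQL and Chroma)
data_types: the core data structures used by MemGPT, such as AgentState, Message, and Passage
embeddings: embedding-related logic, such as text chunking and vectorization
functions: definition, registration, and invocation of custom functions
memory: CoreMemory, RecallMemory, and ArchivalMemory, each corresponding to a different tier of memory
models: the request and response formats used when talking to the LLM
persistence_manager: manages how Agent state is persisted across storage backends
prompts: a set of default prompt templates
server: exposes the Agent as a RESTful API service

1.2 Module Interaction and Data Flow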

sequenceDiagram
    participant C as Client
    participant S as Server
    participant A as Agent
    participant M as Memory
    participant P as PersistenceManager
    participant L as LLM
    
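    C->>S: Send message
    S->>A: Pass message to the Agent
    A->>M: Query/update Memory
    A->>P: Persist state changes
    A->>L: Send message sequence, request LLM response
    L-->>A: Return LLM response
    A->>A: Parse LLM response (including function calls)
    A-->>S: Return final reply
    S-->>C: Return final reply

The Client sends a message to the Agent through the Server
The Agent queries/updates its memory and persists state changes through the PersistenceManager
The Agent sends the message history to the LLM and requests the next response
The LLM returns a response, which may contain a function call
The Agent parses the LLM response, executes function calls where required, and produces the final reply
The Agent returns the final reply to the Client through the Server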

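2. Agent State Management

2.1 Design and Implementation of the Agent Class

The Agent class (agent.py) is the heart of MemGPT. It encapsulates an Agent's complete state and interaction logic. Its main components are:

config: the Agent's configuration, such as the LLM/embedding settings and which memory and functions it uses
memory: the Agent's core memory, made up of the persona and human sections
functions: definitions of the functions the Agent can call
messages: the Agent's message history

The central method of the Agent class is step(). It takes a user message, runs it through a series of processing steps, and produces the Agent's reply:

Append the user message to messages
Build the prompt (system, memory, functions, messages, and so on) and send it to the LLM
Parse the LLM response and extract any function call and the reply content
If there is a function call, execute the corresponding Python function and append its result to messages
If necessary, summarize messages to keep the token count under control
Append the Agent's reply to messages
Persist state changes as needed (through the PersistenceManager)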
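
The steps above can be sketched as a small loop. This is a minimal illustration under stated assumptions, not MemGPT's actual Agent class: MiniAgent, fake_llm, and the max_messages cap are hypothetical names, the LLM is replaced by a plain callable, and dropping the oldest messages stands in for real summarization.

```python
import json
from typing import Callable, Dict, List

# Hypothetical stand-in for an LLM client: message list in, response dict out.
LLMFn = Callable[[List[dict]], dict]


class MiniAgent:
    """Toy agent illustrating the step() flow (assumed design, not MemGPT's code)."""

    def __init__(self, llm: LLMFn, functions: Dict[str, Callable[..., str]],
                 max_messages: int = 20) -> None:
        self.llm = llm
        self.functions = functions
        self.max_messages = max_messages
        self.messages: List[dict] = [{"role": "system", "content": "You are helpful."}]

    def step(self, user_message: str) -> str:
        # Record the user message.
        self.messages.append({"role": "user", "content": user_message})
        while True:
            # Send the full context to the LLM and record its response.
            response = self.llm(self.messages)
            self.messages.append({"role": "assistant", **response})
            call = response.get("function_call")
            if call is None:
                break  # plain reply: the turn is finished
            # Execute the requested function and feed the result back.
            func = self.functions[call["name"]]
            result = func(**json.loads(call["arguments"]))
            self.messages.append(
                {"role": "function", "name": call["name"], "content": result}
            )
        # Crude stand-in for summarization: drop the oldest non-system messages.
        overflow = len(self.messages) - self.max_messages
        if overflow > 0:
            del self.messages[1:1 + overflow]
        return response["content"]


def fake_llm(messages: List[dict]) -> dict:
    """Scripted model: asks for add() once, then answers with the result."""
    if messages[-1]["role"] == "function":
        return {"content": f"The answer is {messages[-1]['content']}."}
    return {
        "content": None,
        "function_call": {"name": "add", "arguments": '{"a": 2, "b": 3}'},
    }


agent = MiniAgent(fake_llm, {"add": lambda a, b: str(a + b)})
reply = agent.step("What is 2 + 3?")
print(reply)
```

Persistence is left out here on purpose; in the flow described above it would happen after the reply is appended.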

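2.2 PersistenceManager and Storage Backends

The abstract PersistenceManager class (persistence_manager.py) defines a set of interfaces for persisting Agent state across different storage backends.

The main implementation at present is LocalStateManager, which keeps state directly in memory and is mainly intended for demo and debugging scenarios. A production deployment needs a database-backed PersistenceManager that writes state to a database.

The main PersistenceManager methods are:

append_messages(): append new messages
trim_messages(): remove some messages from the head of the history
swap_system_message(): replace the system message
update_memory(): update the memory contents

With a durable backend behind the PersistenceManager, Agent state changes can be synchronized to the underlying storage and restored after the Agent restarts.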
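
To make the interface concrete, here is a minimal sketch of the four methods listed above with an in-memory implementation. The class names, the parameter names, and the choice to keep the system message at index 0 when trimming are assumptions made for illustration; the real signatures in persistence_manager.py may differ.

```python
from abc import ABC, abstractmethod
from typing import Dict, List


class PersistenceManagerSketch(ABC):
    """Hypothetical interface mirroring the four methods named in the text."""

    @abstractmethod
    def append_messages(self, messages: List[dict]) -> None: ...

    @abstractmethod
    def trim_messages(self, num: int) -> None: ...

    @abstractmethod
    def swap_system_message(self, new_system_message: dict) -> None: ...

    @abstractmethod
    def update_memory(self, new_memory: Dict[str, str]) -> None: ...


class InMemoryStateManager(PersistenceManagerSketch):
    """Keeps everything in Python lists/dicts, so state is lost on restart."""

    def __init__(self, system_message: dict) -> None:
        self.messages: List[dict] = [system_message]
        self.memory: Dict[str, str] = {}

    def append_messages(self, messages: List[dict]) -> None:
        self.messages.extend(messages)

    def trim_messages(self, num: int) -> None:
        # Assumption: index 0 is the system message and is never trimmed.
        del self.messages[1:1 + num]

    def swap_system_message(self, new_system_message: dict) -> None:
        self.messages[0] = new_system_message

    def update_memory(self, new_memory: Dict[str, str]) -> None:
        self.memory = dict(new_memory)


manager = InMemoryStateManager({"role": "system", "content": "v1"})
manager.append_messages([{"role": "user", "content": str(i)} for i in range(3)])
manager.trim_messages(2)
manager.swap_system_message({"role": "system", "content": "v2"})
manager.update_memory({"human": "Name: John"})
```

A database-backed variant would implement the same four methods but issue writes to the store instead of mutating local lists.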

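2.3 AgentState and Other Core Data Structures

AgentState (data_types.py) encapsulates an Agent's complete state, including its configuration, initial prompt, function definitions, and message history. It is mainly used to pass state between the Agent and persistent storage.

Other core data structures include:

Message: a message sent or received by the Agent, with fields such as role (system/assistant/user), content, and function_call
Passage: a text fragment used for semantic retrieval, with fields such as text and embedding
LLMConfig/EmbeddingConfig: the LLM and embedding configuration, such as model and api_key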
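
As a rough picture of these structures, the sketch below models them as dataclasses and round-trips an agent state through a plain dict, which is the shape a storage backend would typically receive. All class and field names here (MessageSketch, system_prompt, and so on) are illustrative assumptions, not the definitions in data_types.py.

```python
from dataclasses import asdict, dataclass, field
from typing import List, Optional


@dataclass
class MessageSketch:
    role: str  # "system", "assistant", or "user"
    content: Optional[str] = None
    function_call: Optional[dict] = None


@dataclass
class PassageSketch:
    text: str
    embedding: List[float] = field(default_factory=list)


@dataclass
class AgentStateSketch:
    name: str
    system_prompt: str
    functions: List[dict] = field(default_factory=list)
    messages: List[MessageSketch] = field(default_factory=list)

    def to_dict(self) -> dict:
        # asdict() recurses into nested dataclasses, giving JSON-friendly data.
        return asdict(self)

    @classmethod
    def from_dict(cls, data: dict) -> "AgentStateSketch":
        messages = [MessageSketch(**m) for m in data.get("messages", [])]
        return cls(
            name=data["name"],
            system_prompt=data["system_prompt"],
            functions=list(data.get("functions", [])),
            messages=messages,
        )


state = AgentStateSketch(
    name="demo",
    system_prompt="You are helpful.",
    messages=[MessageSketch(role="user", content="hello")],
)
restored = AgentStateSketch.from_dict(state.to_dict())
```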

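2.4 State Transitions During an Agent Run

The following walks through a typical scenario to show how the Agent's state changes as it runs:

stateDiagram-v2
    [*] --> Idle: Create Agent
    Idle --> WaitingForUserMessage: Wait for user message
    WaitingForUserMessage --> ProcessingUserMessage: User message received
    ProcessingUserMessage --> WaitingForLLMResponse: Send messages to LLM
    WaitingForLLMResponse --> ProcessingLLMResponse: LLM response received
    ProcessingLLMResponse --> CallingFunction: Call function
    CallingFunction --> ProcessingLLMResponse: Function finished
    ProcessingLLMResponse --> GeneratingReply: Generate reply
    GeneratingReply --> PersistingState: Persist state changes
    PersistingState --> WaitingForUserMessage: Wait for next user message

After creation the Agent is in the Idle state and then moves to WaitingForUserMessage
When a user message arrives, it enters ProcessingUserMessage and appends the message to messages
It sends the message sequence to the LLM and enters WaitingForLLMResponse
When the LLM response arrives, it enters ProcessingLLMResponse and parses the response
If the response contains a function call, it enters CallingFunction and executes the corresponding function
Once the function finishes, it returns to ProcessingLLMResponse
It produces the final reply and enters GeneratingReply
It persists state changes (messages, memory, and so on) and enters PersistingState
Once persistence completes, it returns to WaitingForUserMessage and waits for the next user message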
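
The diagram can be encoded directly as a transition table. The sketch below is one possible encoding and assumes nothing about how MemGPT tracks state internally; AgentPhase, ALLOWED, and advance are hypothetical names.

```python
from enum import Enum, auto


class AgentPhase(Enum):
    IDLE = auto()
    WAITING_FOR_USER_MESSAGE = auto()
    PROCESSING_USER_MESSAGE = auto()
    WAITING_FOR_LLM_RESPONSE = auto()
    PROCESSING_LLM_RESPONSE = auto()
    CALLING_FUNCTION = auto()
    GENERATING_REPLY = auto()
    PERSISTING_STATE = auto()


P = AgentPhase
# One entry per arrow in the state diagram.
ALLOWED = {
    P.IDLE: {P.WAITING_FOR_USER_MESSAGE},
    P.WAITING_FOR_USER_MESSAGE: {P.PROCESSING_USER_MESSAGE},
    P.PROCESSING_USER_MESSAGE: {P.WAITING_FOR_LLM_RESPONSE},
    P.WAITING_FOR_LLM_RESPONSE: {P.PROCESSING_LLM_RESPONSE},
    P.PROCESSING_LLM_RESPONSE: {P.CALLING_FUNCTION, P.GENERATING_REPLY},
    P.CALLING_FUNCTION: {P.PROCESSING_LLM_RESPONSE},
    P.GENERATING_REPLY: {P.PERSISTING_STATE},
    P.PERSISTING_STATE: {P.WAITING_FOR_USER_MESSAGE},
}


def advance(current: AgentPhase, target: AgentPhase) -> AgentPhase:
    """Return target if the diagram allows current -> target, else raise."""
    if target not in ALLOWED[current]:
        raise ValueError(f"illegal transition: {current.name} -> {target.name}")
    return target


# One full turn that includes a single function call.
path = [
    P.WAITING_FOR_USER_MESSAGE, P.PROCESSING_USER_MESSAGE,
    P.WAITING_FOR_LLM_RESPONSE, P.PROCESSING_LLM_RESPONSE,
    P.CALLING_FUNCTION, P.PROCESSING_LLM_RESPONSE,
    P.GENERATING_REPLY, P.PERSISTING_STATE, P.WAITING_FOR_USER_MESSAGE,
]
phase = P.IDLE
for target in path:
    phase = advance(phase, target)
```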

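3. Memory Management

3.1 The Three Memory Types: Roles and Implementation

MemGPT divides an Agent's memory into three tiers:

CoreMemory (memory.py): the memory section that appears directly in the Agent's prompt, usually made up of persona (the Agent's view of itself) and human (information about the user). CoreMemory takes part in the conversation directly but is bounded by the token budget, so it can hold only the most essential information.
RecallMemory (memory.py): the Agent's conversation history, retrievable through text search and date-range queries. RecallMemory does not appear directly in the prompt, but the Agent can reach its contents through functions such as conversation_search. It is implemented on top of a relational database such as PostgreSQL.
ArchivalMemory (memory.py): the Agent's long-term memory, accessed through semantic retrieval. ArchivalMemory does not appear directly in the prompt either, but the Agent can reach its contents through functions such as archival_memory_search. It is implemented on top of a vector database such as Chroma or Pinecone, which maps text fragments to vectors and supports efficient similarity search.

During a conversation, CoreMemory mainly maintains the Agent's state, while RecallMemory and ArchivalMemory serve knowledge retrieval and question answering.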
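
Because CoreMemory lives inside the prompt, every edit to it has to respect a size budget. The sketch below illustrates that constraint with a per-section character limit as a simple proxy for a token budget; CoreMemorySketch, its limit field, and the append/replace method names are assumptions for illustration rather than the API in memory.py.

```python
from dataclasses import dataclass


@dataclass
class CoreMemorySketch:
    persona: str = ""
    human: str = ""
    limit: int = 200  # hypothetical per-section character budget

    def _check(self, section: str, value: str) -> None:
        if section not in ("persona", "human"):
            raise KeyError(f"unknown section: {section}")
        if len(value) > self.limit:
            raise ValueError(f"{section} would exceed {self.limit} characters")

    def append(self, section: str, text: str) -> None:
        """Add a line to a section, refusing edits that overflow the budget."""
        current = getattr(self, section, "")
        updated = f"{current}\n{text}" if current else text
        self._check(section, updated)
        setattr(self, section, updated)

    def replace(self, section: str, old: str, new: str) -> None:
        """Swap old for new inside a section, with the same budget check."""
        current = getattr(self, section, "")
        if old not in current:
            raise ValueError(f"{old!r} not found in {section}")
        updated = current.replace(old, new)
        self._check(section, updated)
        setattr(self, section, updated)


memory = CoreMemorySketch(persona="I am Sam.", limit=40)
memory.append("human", "Name: John")
memory.append("human", "Likes hiking")
memory.replace("human", "John", "Jon")
```

Rejecting an oversized edit, rather than silently truncating, forces the caller to decide what to move out to a slower memory tier.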

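3.2 Vector Databases in RecallMemory and ArchivalMemory

Both RecallMemory and ArchivalMemory use a vector database for retrieval:

For RecallMemory, an embedding vector is computed for each message when it is inserted, to support later semantic retrieval. At query time the input text is also converted into a vector, and the most similar messages are found in vector space.
For ArchivalMemory, documents from an external knowledge base are split into small passages, and an embedding vector is computed for each passage. When the Agent needs this knowledge, it converts the question into a vector and retrieves the most relevant passages from ArchivalMemory.

The embedding model turns unstructured text into vectors, and the vector database indexes those vectors so that semantic retrieval is efficient. Common vector databases include Pinecone, Weaviate, and Chroma. MemGPT mainly uses Chroma, but it also provides an abstract interface, StorageConnector, that allows other vector databases to be plugged in.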
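
The insert-then-search cycle described above can be shown with a toy example. To stay self-contained it uses bag-of-words counts as a stand-in "embedding" and a linear scan instead of an index; a real deployment would call an embedding model and a vector database. embed, cosine, and ArchivalStoreSketch are hypothetical names.

```python
import math
from collections import Counter
from typing import List, Tuple


def embed(text: str) -> Counter:
    """Toy embedding: word counts. Real systems use a neural embedding model."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[word] * b[word] for word in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ArchivalStoreSketch:
    """Stores (text, vector) pairs and ranks them by similarity to a query."""

    def __init__(self) -> None:
        self.passages: List[Tuple[str, Counter]] = []

    def insert(self, text: str) -> None:
        # The vector is computed once, at insert time.
        self.passages.append((text, embed(text)))

    def search(self, query: str, top_k: int = 1) -> List[str]:
        query_vec = embed(query)
        ranked = sorted(
            self.passages, key=lambda p: cosine(query_vec, p[1]), reverse=True
        )
        return [text for text, _ in ranked[:top_k]]


store = ArchivalStoreSketch()
store.insert("paris is the capital of france")
store.insert("python is a programming language")
store.insert("the eiffel tower is in paris")
hits = store.search("capital of france")
```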

3.3 How CoreMemory Takes Part in Prompt Construction

Below is a simplified version of the prompt-building process, showing how CoreMemory fits in:

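from __future__ import annotations  # keeps the annotations below unevaluated

from typing import List


# Simplified sketch: CoreMemory, Message, Function, format_messages and
# format_functions are assumed to be defined elsewhere.
def construct_prompt(
    system: str,
    memory: CoreMemory,
    messages: List[Message],
    functions: List[Function],
) -> str: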
    prompt = f"""
    {system}
    
    ### Memory
    <persona>
    {memory.persona}
    </persona>
    
    <human>
    {memory.human}
    </human>
    
    ### Messages
    {format_messages(messages)}
    
    ### Functions
    {format_functions(functions)}
    
    ### Response
    """
    
    return prompt

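As shown, the persona and human fields of CoreMemory are inserted directly into the Memory section of the prompt, which lets the contents of CoreMemory influence the Agent's behavior directly.

RecallMemory and ArchivalMemory, by contrast, do not appear directly in the prompt. They contribute mainly through function calls (conversation_search, archival_memory_search, and so on).

4. Function-Based Interaction

4.1 Function Definitions and Automatic Schema Generation

In MemGPT, functions are defined as ordinary Python functions in the functions directory. For example, a simple search function: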

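import wikipedia  # third-party package (pip install wikipedia)


def search_wikipedia(query: str) -> str:
    """Search Wikipedia for information.

    Args:
        query (str): Query to search Wikipedia for.
    """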
    search_quality_reflection = "The original query could not be satisfactorily answered by searching Wikipedia. Additional information from other sources may be needed."
    search_quality_score = 2
    
    results = wikipedia.search(query)
    
    if not results:
        return f"No results found for query: {query}. {search_quality_reflection}"
    
    try:
        summary = wikipedia.summary(results[0], sentences=3)
        search_quality_score = 4
        search_quality_reflection = "The Wikipedia search provided relevant information to help answer the original query. However, additional details or clarification may still be needed."
    except wikipedia.exceptions.PageError:
        summary = f"No page found for {results[0]}. {search_quality_reflection}"
    except wikipedia.exceptions.DisambiguationError as e:
        summary = f"Multiple pages found for {results[0]}: {e.options}. {search_quality_reflection}"
        
    return f"{summary}\nSearch quality score: {search_quality_score}\nSearch quality reflection: {search_quality_reflection}"

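MemGPT automatically analyzes the function's signature and docstring, extracts the input parameter (query) and its type, and generates the corresponding JSON Schema: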

{
    "name": "search_wikipedia",
    "description": "Search Wikipedia for information.",
    "parameters": {
        "type": "object",
        "properties": {
            "query": {
                "type": "string",
                "description": "Query to search Wikipedia for."
            }
        },
        "required": ["query"]
    }
}

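This schema describes the function's name, description, and parameter list. It can be passed directly to the LLM so that the model knows how to call the function.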
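
A schema like this can be derived from a function with the standard inspect module. The following is a minimal sketch of the idea, not MemGPT's generator: generate_schema_sketch and roll_dice are hypothetical, only a few primitive types are mapped, and per-parameter descriptions (which would come from parsing the docstring's Args section) are left out.

```python
import inspect
from typing import Callable, get_type_hints

# Python annotation -> JSON Schema type name (deliberately incomplete).
_JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}


def generate_schema_sketch(func: Callable) -> dict:
    """Build a function-calling schema from a signature and docstring."""
    signature = inspect.signature(func)
    hints = get_type_hints(func)
    doc = inspect.getdoc(func) or ""

    properties = {}
    required = []
    for name, param in signature.parameters.items():
        # Unannotated or unmapped types fall back to "string".
        properties[name] = {"type": _JSON_TYPES.get(hints.get(name), "string")}
        # Parameters without a default value must be supplied by the caller.
        if param.default is inspect.Parameter.empty:
            required.append(name)

    return {
        "name": func.__name__,
        "description": doc.split("\n")[0],  # first docstring line
        "parameters": {
            "type": "object",
            "properties": properties,
            "required": required,
        },
    }


def roll_dice(sides: int, times: int = 1) -> str:
    """Roll a die a number of times.

    Longer explanation that should not end up in the description.
    """
    return f"rolled d{sides} x{times}"


schema = generate_schema_sketch(roll_dice)
```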

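4.2 Function Registration and Invocation

Function registration starts at load_all_function_sets() in functions.py:

Scan the functions directory and import every Python file
For each file, call load_function_set() to extract the functions it defines
- Analyze the function signature with the inspect module
- Call generate_schema() to produce the JSON Schema
- Bundle the function and its schema into a FunctionSet
Merge all FunctionSets into a single dictionary for the Agent to use

When the Agent receives a function call request from the LLM, it goes through these steps:

Parse the request and extract the function name and arguments
Look up the matching function definition in the functions dictionary
Invoke the function dynamically, passing in the parsed arguments
Take the return value, append it to the message stream, and send it to the LLM

The process can be summarized as: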

sequenceDiagram
    LLM->>Agent: Function call request
    Agent->>FunctionRegistry: Look up function definition
    FunctionRegistry-->>Agent: Return function definition
    Agent->>Function: Dynamic call
    Function-->>Agent: Return result
    Agent->>LLM: Append to message stream
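
The lookup-and-call step in this flow can be sketched as a single dispatcher. This is an assumed shape, not MemGPT's code: dispatch_function_call is a hypothetical helper, the request is taken to carry its arguments as a JSON string, and errors are returned as message content so the LLM can see them and retry.

```python
import json
from typing import Callable, Dict


def dispatch_function_call(call: dict, registry: Dict[str, Callable[..., str]]) -> dict:
    """Resolve and run one function call, returning a message for the LLM."""
    name = call.get("name")
    func = registry.get(name)
    if func is None:
        content = f"Error: unknown function {name!r}"
    else:
        try:
            # Arguments arrive as a JSON string and become keyword arguments.
            args = json.loads(call.get("arguments") or "{}")
            content = str(func(**args))
        except (json.JSONDecodeError, TypeError) as exc:
            # Malformed JSON or mismatched parameters: report instead of crashing.
            content = f"Error: {exc}"
    return {"role": "function", "name": name, "content": content}


registry = {"echo": lambda text: text.upper()}

ok = dispatch_function_call({"name": "echo", "arguments": '{"text": "hi"}'}, registry)
missing = dispatch_function_call({"name": "nope", "arguments": "{}"}, registry)
bad_json = dispatch_function_call({"name": "echo", "arguments": "{not json"}, registry)
bad_args = dispatch_function_call({"name": "echo", "arguments": '{"wrong": 1}'}, registry)
```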

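4.3 The Role of Functions in Agent Interaction

Functions greatly extend what an Agent can do by letting it access and act on external information. Typical use cases include:

Knowledge retrieval: through functions such as archival_memory_search, the Agent can retrieve information from its own knowledge base to answer questions.
Data queries: the Agent can use functions to fetch data from external sources such as databases and APIs.
Tool integration: the Agent can call external tools through functions, such as search engines, calculators, and translation services.
Environment interaction: in some applications the Agent can also use functions to affect its environment, for example by controlling smart-home devices.

Here is an example of a function at work in an Agent conversation: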

User: What is the capital of France?

Assistant: To find the capital of France, I will perform a Wikipedia search.
<search_quality_reflection>
The Wikipedia search should provide the needed information to answer this query directly.
</search_quality_reflection>
<search_quality_score>5</search_quality_score>

<function_call>
{
  "name": "search_wikipedia",
  "arguments": {
    "query": "capital of France"
  }
}
</function_call>

<search_result>
Paris is the capital and most populous city of France, with an estimated population of 2,165,423 residents in 2019 in an area of more than 105 square kilometres (41 square miles). Since the 17th century, Paris has been one of Europe's major centres of finance, diplomacy, commerce, fashion, gastronomy, science, and arts. The City of Paris is the centre and seat of government of the region and province of Île-de-France, or Paris Region, which has an estimated population of 12,174,880, or about 18 percent of the population of France as of 2017.
</search_result>

<search_quality_reflection>
The Wikipedia search provided a clear and direct answer to the question, giving relevant information about Paris being the capital of France along with additional details like population and significance. I believe I can answer the original query satisfactorily with this information.
</search_quality_reflection>
<search_quality_score>5</search_quality_score>

<result>
Paris is the capital of France. It is the country's most populous city, with an estimated population of over 2.1 million residents as of 2019. Paris has been a major center of finance, diplomacy, commerce, fashion, science and the arts in Europe since the 17th century. It is the seat of government for the Île-de-France region.
</result>

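This example shows how a function helps the Agent answer a question:

The Agent receives a question about the capital of France
The Agent decides to look up the answer with the search_wikipedia function
search_wikipedia calls the Wikipedia API to search for "capital of France"
The function returns a summary of the search results containing the answer
The Agent appends the search result to the message stream
Based on the search result, the Agent produces the final answer and returns it to the user

Functions let the Agent reach information outside its own knowledge, which substantially improves its ability to answer questions. They also give the Agent a way to "think": it can record its function calls and its assessment of the results in the message stream, making its reasoning clearer and easier to explain.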

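5. Serving

5.1 RESTful API Design

MemGPT provides a RESTful API that exposes the Agent's capabilities as a web service. The main endpoints are:

/agents: create and query Agents
/agents/{agent_id}/config: query and modify an Agent's configuration
/agents/{agent_id}/memory: query and modify an Agent's memory
/agents/{agent_id}/messages: send and query an Agent's messages

These endpoints cover the main operations on an Agent, making it straightforward to integrate Agents into applications.

The API follows RESTful principles:

URLs identify resources; for example, /agents/{agent_id} denotes a specific Agent
HTTP methods express operations: GET to query, POST to create, PUT to update, and so on
JSON is the data exchange format
Requests are stateless: each one carries all the information needed to complete the operation

5.2 Using FastAPI

MemGPT implements its web service with the FastAPI framework. FastAPI is a high-performance web framework for Python 3.6+ with the following characteristics:

Built on standard Python type hints, which makes code easy to write and maintain
Automatic API documentation (Swagger UI and ReDoc)
High performance, comparable to NodeJS and Go
Support for asynchronous request handling

In MemGPT, FastAPI is mainly used for:

Defining API routes and handler functions. For example: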

@router.post("/agents", response_model=CreateAgentResponse)
def create_agent(
    request: CreateAgentRequest, 
    user_id: UUID = Depends(get_current_user)
):
    # create agent logic here
    ...

Serialization and validation of request parameters and response data. For example:

class CreateAgentRequest(BaseModel):
    config: Dict[str, Any] = Field(...)

class CreateAgentResponse(BaseModel):
    agent_id: UUID = Field(...)

Dependency injection. For example:

async def get_current_user(token: str = Depends(oauth2_scheme)):
    # authenticate user logic here
    ...

Exception handling. For example:

@app.exception_handler(AgentNotFoundError)
async def agent_not_found_handler(request: Request, exc: AgentNotFoundError):
    return JSONResponse(
        status_code=404,
        content={"message": f"Agent {exc.agent_id} not found"},
    )

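With FastAPI, MemGPT can implement a full-featured, well-documented, and fast web service in relatively little code.

5.3 Core Server Logic

The SyncServer class (server.py) implements the core logic of the MemGPT service, which mainly covers:

Agent management:
- create_agent: create a new Agent
- get_agent: fetch the details of an Agent
- list_agents: list all Agents
- delete_agent: delete an Agent
Message handling:
- process_message: handle a user message by sending it to the corresponding Agent and returning the Agent's response
- _step: the core message-processing logic, including appending the user message, calling the LLM, handling the LLM response, executing function calls, and producing the final reply
Memory management:
- get_agent_memory: fetch the Agent's current memory state
- update_agent_memory: update the Agent's memory
Function management:
- list_functions: list all available functions
- create_function: add a new function
- delete_function: delete a function

When SyncServer receives a request, it calls the corresponding method and returns the result. Most of these methods delegate to the Agent's own methods, so SyncServer can be seen as a wrapper layer over the Agent's capabilities.

SyncServer is also responsible for:

Managing the Agent lifecycle (create, save, load, delete)
Coordinating the Agent's interaction with other components (such as MemoryStore and FunctionRegistry)
Handling error conditions to keep the service stable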
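
The wrapper role described above can be illustrated with a very small facade. Everything here is an assumption made for the sketch: ServerSketch is not SyncServer, an agent is reduced to a "message in, reply out" callable, and only three of the listed methods are shown.

```python
import uuid
from typing import Callable, Dict, List

# Hypothetical simplification: an agent is any callable from message to reply.
AgentFn = Callable[[str], str]


class AgentNotFoundError(Exception):
    def __init__(self, agent_id: uuid.UUID) -> None:
        super().__init__(f"Agent {agent_id} not found")
        self.agent_id = agent_id


class ServerSketch:
    """Owns the agents and routes each request to the right one."""

    def __init__(self) -> None:
        self._agents: Dict[uuid.UUID, AgentFn] = {}

    def create_agent(self, agent: AgentFn) -> uuid.UUID:
        agent_id = uuid.uuid4()
        self._agents[agent_id] = agent
        return agent_id

    def list_agents(self) -> List[uuid.UUID]:
        return list(self._agents)

    def delete_agent(self, agent_id: uuid.UUID) -> None:
        if self._agents.pop(agent_id, None) is None:
            raise AgentNotFoundError(agent_id)

    def process_message(self, agent_id: uuid.UUID, message: str) -> str:
        agent = self._agents.get(agent_id)
        if agent is None:
            # A web layer could map this to a 404 response.
            raise AgentNotFoundError(agent_id)
        return agent(message)


server = ServerSketch()
agent_id = server.create_agent(str.upper)
reply = server.process_message(agent_id, "hello")
listed = server.list_agents()
server.delete_agent(agent_id)
```

Keeping the registry behind a single class means the HTTP handlers never touch agent objects directly, which is the layering the section describes.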

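6. LLM Adaptation

6.1 Similarities and Differences Between LLM Interfaces

The mainstream LLM services differ in their interface design. For example:

OpenAI's Chat Completions API passes context through a messages parameter. Anthropic's legacy Text Completions API took a single prompt string, while its Messages API also accepts messages but takes the system prompt as a separate top-level system parameter
OpenAI describes callable functions through the functions parameter (since superseded by tools), while Anthropic exposes tool use through a tools parameter whose schema field is named input_schema rather than parameters
Both use temperature to control output randomness, and Anthropic additionally offers top_k

Their core capabilities are nevertheless the same. They all support:

Passing in context
Streaming responses
Function calling

This is what makes adapting to different LLMs feasible.

6.2 How the llm_api Module Adapts to Providers

MemGPT wraps the interaction logic for different LLMs in the llm_api module. Files such as openai.py and anthropic.py hold the adapter code for each provider.

Taking openai.py as an example, its main functions are:

openai_chat_completions_request: send a Chat Completion request and return the result, corresponding to OpenAI's /v1/chat/completions endpoint
openai_chat_completions_stream: send a Chat Completion request and stream the result back, corresponding to the stream mode of OpenAI's /v1/chat/completions endpoint
openai_embeddings_request: send an Embeddings request and return the result, corresponding to OpenAI's /v1/embeddings endpoint

These functions accept a uniform set of parameters (such as api_key, model, and messages), convert them into the provider's request format, send the request, and convert the response back into a uniform format.

For example, the core logic of openai_chat_completions_request looks like this: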

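from typing import List

import requests

# Assumed default; the original excerpt uses OPENAI_API_BASE without defining it.
OPENAI_API_BASE = "https://api.openai.com"


class OpenAIError(Exception):
    """Raised on a non-200 response (defined here to keep the excerpt self-contained)."""


def openai_chat_completions_request(
    api_key: str, model: str, messages: List[dict]
) -> dict:
    url = f"{OPENAI_API_BASE}/v1/chat/completions"
    headers = {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {api_key}",
    }
    data = {
        "model": model,
        "messages": messages,
    }

    # The timeout keeps a stalled connection from hanging forever.
    response = requests.post(url, headers=headers, json=data, timeout=60)

    # Check the status before parsing: error bodies are not guaranteed to be JSON.
    if response.status_code != 200:
        raise OpenAIError(response.text)

    return response.json()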

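It packages messages and the other parameters in the shape OpenAI expects, sends the request, and checks the response status. If the request fails, it raises an OpenAIError.

In this way the llm_api module hides the differences between LLM interfaces and gives the upper layers a consistent way to call them.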
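
To show what "hiding the differences" can look like in practice, the sketch below converts one uniform message list into two provider-specific request bodies. The function names and the max_tokens default are assumptions made for illustration; the field layout follows the public Chat Completions and Messages APIs, where Anthropic takes the system prompt as a top-level system field and requires max_tokens.

```python
from typing import List, Optional


def to_openai_payload(model: str, messages: List[dict],
                      temperature: Optional[float] = None) -> dict:
    """OpenAI Chat Completions: system messages stay inside `messages`."""
    payload = {"model": model, "messages": list(messages)}
    if temperature is not None:
        payload["temperature"] = temperature
    return payload


def to_anthropic_payload(model: str, messages: List[dict], max_tokens: int = 1024,
                         temperature: Optional[float] = None) -> dict:
    """Anthropic Messages: system text moves to a top-level `system` field."""
    system_parts = [m["content"] for m in messages if m["role"] == "system"]
    chat = [m for m in messages if m["role"] != "system"]
    payload = {"model": model, "max_tokens": max_tokens, "messages": chat}
    if system_parts:
        payload["system"] = "\n".join(system_parts)
    if temperature is not None:
        payload["temperature"] = temperature
    return payload


# "model-a" and "model-b" are placeholders, not real model identifiers.
conversation = [
    {"role": "system", "content": "Be brief."},
    {"role": "user", "content": "Hi"},
]
openai_body = to_openai_payload("model-a", conversation, temperature=0.2)
anthropic_body = to_anthropic_payload("model-b", conversation)
```

The caller builds the conversation once; only the last step before the HTTP request knows which provider it is talking to.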

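6.3 ChatML in the Agent

Beyond API adaptation, MemGPT also builds the Agent's prompt in a general-purpose markup format that this article refers to as ChatML. Note that ChatML as published by OpenAI delimits each turn with <|im_start|> and <|im_end|> tokens rather than XML tags, so the XML-style layout used in this section should be read as a simplified illustration of a tag-structured prompt, not as the literal format.

For example, a simple prompt in this style might look like this: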

<chatbot>
  <description>
    Assistant is a large language model trained by Anthropic to be helpful, harmless, and honest.
  </description>
  
  <memory>
    <human_attributes>
      The human is a 25-year-old male named John. John is a software engineer who loves hiking and photography.
    </human_attributes>
    <conversation_summary>
      John is planning a hiking trip and has asked the assistant for advice on choosing a destination and packing gear.
    </conversation_summary>
  </memory>
  
  <message>
    <human>What are some good beginner-friendly hiking trails in California?</human>
    <assistant>Here are some great beginner-friendly hiking trails in California:
    
    1. Fern Canyon Trail in Prairie Creek Redwoods State Park: This easy 1-mile loop takes you through a lush canyon filled with ferns and waterfalls. It's a great introduction to the beauty of the redwood forests.
    
    2. Bumpass Hell Trail in Lassen Volcanic National Park: This 3-mile round trip hike takes you to a fascinating geothermal area with boiling pools and mud pots. The trail is well-maintained and offers stunning views of the park.
    
    3. Moro Rock Trail in Sequoia National Park: This short but steep 0.5-mile hike leads to the top of a granite dome with panoramic views of the surrounding mountains and valleys. There are stairs and handrails to assist hikers.
    
    4. Inspiration Point Trail in Yosemite National Park: This easy 2.6-mile round trip hike offers classic views of Yosemite Valley, including Half Dome and El Capitan. It's a great option for those who want to experience Yosemite's iconic scenery without a strenuous hike.
    
    Remember to always check trail conditions, bring plenty of water, and wear appropriate footwear and sun protection. Let me know if you need any other hiking advice!</assistant>
  </message>
</chatbot>

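This prompt contains a description of the Assistant, some background on the Human, and a stretch of Human-Assistant conversation history.

The main advantages of building prompts this way are:

It unifies the prompt format across LLMs, which improves portability
It uses tags to organize the prompt structure, which improves readability and maintainability
It makes it easy to embed context such as Memory and FunctionCall

In practice, MemGPT generates the prompt dynamically as needed. For example: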

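from __future__ import annotations  # lets the Agent annotation stay unresolved in this excerpt


# Agent, dict_to_xml and messages_to_xml are assumed to be defined elsewhere.
def generate_prompt(human_input: str, agent: Agent) -> str:
    memory_attributes = {
        "human_attributes": agent.memory.human,
        "conversation_summary": agent.memory.summary,
    }

    # Collect earlier user turns (Message.role is "user", not "human"),
    # then append the new input.
    messages = [
        {"human": message.content}
        for message in agent.history if message.role == "user"
    ] + [{"human": human_input}]

    return f"""
    <chatbot>
      <description>{agent.description}</description>
      <memory>{dict_to_xml(memory_attributes)}</memory>
      <conversation>{messages_to_xml(messages)}</conversation>
    </chatbot>
    """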

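This function pulls information from the Agent's current state (description, memory, and history) and formats it as a tagged prompt. The resulting prompt is then sent to the LLM.

7. Summary and Reflections

7.1 MemGPT's Technical Strengths

To sum up, MemGPT's technical characteristics and strengths include:

A modular Agent architecture in which core components such as Agent, Memory, and Function each have a clear abstraction
A mature memory management mechanism that uses CoreMemory, RecallMemory, and ArchivalMemory to store knowledge at different tiers
A flexible function-based interaction model that lets Agent capabilities be defined in Python, with LLM-ready schemas generated automatically
A FastAPI-based service layer that exposes Agent functionality as a RESTful API
A unified LLM adapter layer (llm_api) and prompt format (ChatML), which improve portability across models

Together these make MemGPT a full-featured, extensible, and easy-to-use framework for building Agents.

7.2 Room for Improvement in Architecture and Code

MemGPT still has room to improve. For example:

The Agent class is fairly complex, combining state management, message handling, and function dispatch, and could be split up further
The Memory module has limited support for different kinds of storage backends (vector databases, document databases, and so on) and could expose a more general interface
The llm_api module's adaptation is fairly simple and does not fully account for differences between LLMs (such as conversation management and prompt optimization); a more flexible adapter mechanism could be designed
Security features such as fine-grained access control and audit logging are missing, which may pose risks in real deployments
Support for deployment and monitoring is incomplete, with no accompanying logging, metrics, or alerting facilities

MemGPT could strengthen these areas in the future to further improve its usability and reliability.

7.3 Lessons for Agent Development

MemGPT offers a good template for Agent development. Its design suggests the following lessons:

Agent = LLM + Memory + Functions: a good Agent needs a store of knowledge (Memory), the ability to reason (LLM), and the ability to act (Functions).
Structured memory management: an Agent's memories should be stored in different media according to their lifetime and purpose, behind a unified access interface.
Functions as capabilities: abstracting an Agent's skills as functions makes them easier to extend and compose. The LLM decides which functions to call, while the execution logic is implemented in Python.
Serving and standardization: wrapping the Agent as a web service makes it easier to integrate into different applications, and using mature API conventions such as REST improves interoperability.
Separating framework from platform: MemGPT separates the Agent's core logic from the specific LLM implementation, so it can adapt to different LLM platforms. This layered architecture is highly flexible.

In short, MemGPT shows how to design a conversational AI system from an "Agent as a service" perspective. Many of its ideas, such as memory management, function abstraction, and service wrapping, are worth borrowing when building your own Agent.

Agent development involves many other topics as well, such as few-shot learning, reinforcement learning, and safety and ethics, all of which are beyond the scope of MemGPT. A truly intelligent Agent calls for exploration and innovation across cognition, learning, reasoning, and execution.

MemGPT is a good starting point toward that goal. I hope this analysis helps readers understand its internals and inspires them to design and build more capable Agent systems.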
